Computer and Modernization ›› 2013, Vol. 1 ›› Issue (9): 31-34. doi: 10.3969/j.issn.1006-2475.2013.09.007

• Artificial Intelligence •

Incomplete Data Information Entropy Classification Algorithm Based on AdaBoost

LYU Jing1, SHU Li-lian2

  1. School of Computer Science and Technology, Anhui University, Hefei 230601, China; 2. Jiangxi Institute of Computing Technology, Nanchang 330002, China
  • Received: 2013-03-29  Revised: 1900-01-01  Online: 2013-09-17  Published: 2013-09-17

Abstract: Existing ensemble classification algorithms for incomplete data do not take the differences among missing attributes into account: when weighting the sub-classifiers, they consider only the size of each sub-dataset and the number of attributes it contains, ignoring how strongly the attributes of the different sub-datasets differ. In this paper, information entropy is used to quantify the importance of each sub-dataset, and the weight of the classifier built on that sub-dataset is derived from this measure, which makes the final weighted voting fairer and the prediction more accurate. Experiments on UCI datasets, using an ensemble classification algorithm based on multi-class AdaBoost with BP neural networks as base classifiers, show that the proposed algorithm improves the classification accuracy on incomplete data to a certain extent.
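The abstract gives no code, so the following is only a minimal Python sketch of the entropy-weighted voting idea it describes. The function names (attribute_entropy, subdataset_weight, entropy_weighted_vote), the histogram-based entropy estimate, and the choice to score a sub-dataset by the sum of its attribute entropies are illustrative assumptions, not the authors' implementation; the paper's actual method combines this kind of weighting with multi-class AdaBoost over BP base classifiers.

```python
import numpy as np

def attribute_entropy(column, bins=10):
    """Shannon entropy (in bits) of one attribute, estimated from a histogram."""
    counts, _ = np.histogram(column, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def subdataset_weight(X_sub):
    """Score a sub-dataset by the total information entropy of its attributes."""
    return sum(attribute_entropy(X_sub[:, j]) for j in range(X_sub.shape[1]))

def entropy_weighted_vote(predictions, weights, n_classes):
    """Combine label predictions from several sub-classifiers by weighted voting."""
    votes = np.zeros((len(predictions[0]), n_classes))
    for pred, w in zip(predictions, weights):
        for i, c in enumerate(pred):
            votes[i, c] += w
    return votes.argmax(axis=1)

# Toy usage: two complete-attribute sub-datasets of different width receive
# entropy-based weights, which then scale their classifiers' votes.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(50, 3))   # sub-dataset with 3 complete attributes
X2 = rng.normal(size=(50, 5))   # sub-dataset with 5 complete attributes
weights = [subdataset_weight(X1), subdataset_weight(X2)]
preds = [rng.integers(0, 3, size=20), rng.integers(0, 3, size=20)]  # stand-ins for classifier outputs
print(entropy_weighted_vote(preds, weights, n_classes=3))
```

In the method described by the abstract, each prediction vector would come from a BP network trained on its sub-dataset and boosted with multi-class AdaBoost; random labels stand in for those outputs here only to keep the sketch self-contained.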

Key words: multi-class AdaBoost, information entropy, incomplete data, ensemble classification

CLC Number: